The following content has been provided by the University of Erlangen-Nürnberg.
Welcome everybody, the final week in the semester.
Professor Hornegger is not here, not today, not tomorrow.
He's in China again.
So I will talk to you a little bit about boosting this week.
Is everybody registered to the exam so far?
Okay.
So what is boosting?
We will first give a short definition.
Boosting is the idea that you take classifiers that on their own don't perform very well.
These are called weak classifiers, and you combine a whole bunch of these weak classifiers,
and at the end you end up with a pretty strong classifier.
And one of the algorithms is called AdaBoost.
So we will have a look at the AdaBoost algorithm.
And some years after its invention, people looked into the theory behind it,
and they realized that it corresponds to a very well-known model.
It's an additive model with an exponential loss function.
So we will talk about this additive model and about this loss function.
And finally, we will apply this algorithm for face detection.
This algorithm is very powerful.
It was introduced 20 years ago.
And the core idea is that you combine many, many weak classifiers.
And AdaBoost was invented in 1997 by Freund and Schapire.
And it's one of the most popular boosting algorithms.
So a short definition of what a weak classifier actually is.
It's a classifier that performs pretty badly:
its error rate is only slightly better than random guessing.
So if you have two classes, a two-class problem with equal prior probabilities,
then random guessing would give you an error rate of 50%.
So you apply several weak classifiers, and you apply them sequentially.
You start with one classifier, and then you change the weights of your training data.
The samples that were misclassified get a higher weight
and are more important for the classifier in the next stage.
So this will give you a sequence of classifiers.
And at the end, you compute a weighted majority vote to get the final prediction.
So each classifier has its own decision,
and you combine all these decisions by a weighted majority vote.
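The procedure just described — fit a weak classifier, increase the weights of the misclassified samples, repeat, and finally combine everything by a weighted majority vote — can be sketched in Python. This is a minimal illustration, not the exact formulation from the lecture: I assume depth-1 decision stumps as the weak classifiers, and the function names and stump search are my own choices.

```python
import numpy as np

def best_stump(X, y, w):
    """Weak learner: exhaustive search over (feature, threshold, polarity)."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (+1, -1):
                pred = s * np.where(X[:, j] <= t, 1, -1)
                err = np.sum(w * (pred != y))  # weighted error of this stump
                if err < best_err:
                    best_err, best = err, (j, t, s)
    return best

def stump_predict(stump, X):
    j, t, s = stump
    return s * np.where(X[:, j] <= t, 1, -1)

def adaboost_train(X, y, n_rounds=50):
    """AdaBoost on labels in {-1, +1}: a sequence of weighted weak classifiers."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # start with uniform sample weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = best_stump(X, y, w)      # fit weak learner to current weights
        pred = stump_predict(stump, X)
        err = max(np.sum(w * (pred != y)), 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this classifier in the vote
        # misclassified samples (y * pred = -1) get a higher weight next round
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    """Weighted majority vote over all weak classifiers."""
    votes = sum(a * stump_predict(st, X) for st, a in zip(stumps, alphas))
    return np.sign(votes)
```

Note that a single stump cannot label an interval pattern like (-1, -1, +1, +1, -1, -1) correctly, but the weighted vote of several stumps can — which is exactly the point of boosting.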
So let's assume we have, again, a two-class problem,
and our class labels are either plus one or minus one.
And we have a set of predictor variables, and we have n samples,
and we have a classifier G, and this classifier outputs plus or minus one.
And we can compare the output of the classifier with the actual class label of the sample.
And here we use the indicator function.
The indicator function is one if the argument is true, and it's zero if the argument is false.
So if the output of the classifier is identical to the class label,
then the indicator function here is zero.
And if it doesn't match, then the indicator function is one.
So we count here all the misclassified samples.
The number of misclassified samples divided by the total number of samples —
that's our error rate.
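This error rate is just the mean of the indicator function over all samples. A tiny sketch of that computation (the function name is my own, not from the lecture):

```python
import numpy as np

def error_rate(predictions, labels):
    """Fraction of misclassified samples: mean of the indicator I(G(x_i) != y_i)."""
    predictions = np.asarray(predictions)
    labels = np.asarray(labels)
    # (predictions != labels) is the indicator: 1 where they differ, 0 where they agree
    return np.mean(predictions != labels)
```

For example, a classifier that gets one of four samples wrong has an error rate of 0.25.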
Presenters
Access: Open access
Duration: 00:49:19 min
Recording date: 2013-02-04
Uploaded: 2013-02-07 12:10:21
Language: en-US